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A simple  evolutionary  model  is  introduced  for  neural  development  along  the  lines 
of  the  Bak-Sneppen  model  for  biological  evolution  of  an  ecology.  The  model  rep- 
resents a set  of  neurons  and  their  connections  together  with  associated  synaptic 
weights.  Evolution  of  the  system  is  studied  for  different  model  fitness  functions 
of  the  synaptic  weights.  The  model  systems  exhibit  Darwinian  evolution  of  the 
synaptic  weight  space  towards  maturation. 


1 Introduction 


Until  recently  the  biological  contribution  to  learning  was  thought  to  be  an  unfolding 
according  to  an  intrinsic  schedule  1,2 . However,  evidence  now  indicates  that  the 
developing  cerebral  cortex  is  largely  free  of  domain-specific  structure  3,4  and  the 
representational  properties  of  the  cortex  are  built  by  the  nature  of  the  problems 
confronting  it. 

At  the  neurobiological  level,  learning  stems  from  the  interaction  between  in- 
trinsic growth  and  environmentally  derived  activity  4.  Two  factors  that  are  fun- 
damental for  this  interaction  are  selection  and  variance.  Selection  5’6,7,8  has  two 
distinct  stages.  The  first  stage  constructs  “pre-representations”.  The  second  stage 
selectively  eliminates  certain  of  these  representations.  The  most  fit  representations 
survive  to  underlie  mature  skills. 

Variation  is  important  for  the  development  of  a maximally  flexible  representa- 
tion capacity.  The  florid  growth  of  neural  tissues  in  ontogeny  4:  synapses;  axonal 
and  dendritic  arborisation  represents  one  possibility  for  introducing  a significant 
chance  element  to  learning.  Representations  are  consolidated  by  a gradual  increase 
in  the  synaptic  weights  of  preferred  units  (synaptic  weight  space).  The  widely 
accepted  mechanism  for  such  consolidation  is  Hebbian  learning  via  positive  rein- 
forcement. At  the  neural  net  level,  Hebbian  learning  follows  from  the  covariance 
of  pre-  and  post-synaptic  discharges.  At  the  circuit  level,  it  involves  synergy  of 
oscillators  involved  in  common  function.  Synapses  which  do  not  participate  in  the 
maintenance  of  circuits  are  eliminated. 

Selectivity  and  variability  are  also  important  aspects  of  biological  evolution. 
One  of  the  simplest  models  for  biological  evolution  is  the  Bak-Sneppen  model  9. 
In  this  model  evolutionary  activity  is  simulated  through  random  mutations  of  the 
least  fit  species  and  its  neighbours.  A characteristic  feature  of  this  model  is  that  it 
exhibits  Self- Organized  Criticality  whereby  the  system  evolves  through  a succession 
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of  punctuated  equilibria  to  a state  where  almost  all  species  have  fitness  above  a 
threshold  level. 

In  this  paper  we  have  introduced  new  variants  of  the  Bak-Sneppen  model  to 
investigate  the  evolution  of  synaptic  weight  space  towards  maturation  in  simple 
models  for  neural  development. 

2 Neural  Connectivity  Model 

Consider  a set  of  n units  (neurons)  labelled  1,2,3 on  a periodic  one- 
dimensional lattice,  i.e.,  units  k + 1 and  k — 1 are  neighbours  of  unit  k which 
is  equivalent  to  unit  n + k.  Associated  with  each  unit  k we  identify  a set  of  c(k) 
connections  (synapses) . Two  versions  of  the  models  are  studied.  One  in  which  the 
number  of  connections  is  kept  fixed  ( c{k ) = m)  and  the  other  in  which  the  number 
of  connections  (for  the  least  fit  unit  and  its  two  neighbours)  is  selected  at  random, 
(c(fc)  € [1, 7TT,])  at  each  update  step.  We  have  in  mind  that  the  former  situation  will 
be  easier  to  investigate  theoretically  whereas  the  latter  will  be  more  representative 
of  a model  neural  system.  Let  Cj(k)  denote  the  jth  connection  associated  with  unit 
k and  identify  a corresponding  synaptic  weight  a < Wj(k ) < b.  Pre-representations 
are  introduced  by  choosing  the  synaptic  weight  for  each  connection  at  random  from 
a uniform  distribution  in  the  range  [a,  6].  Initially  we  suppose  that  the  system  is 
fully  connected,  i.e.,  initially  c(k)  = m for  all  k.  There  is  no  discernible  difference 
in  the  long  term  results  whether  or  not  the  system  is  initially  randomly  connected. 
Selection  and  variation  in  the  system  is  modelled  by  replacing  the  least  fit  unit 
and  its  two  neighbours  by  new  units  with  new  randomly  assigned  connections  and 
weights. 

Three  different  models  have  been  examined.  In  Model  A the  weights  are  random 
numbers  in  the  range  [0, 1]  and  the  fitness  of  a unit  is  defined  as  the  minimum  weight 
for  that  unit.  In  Model  B the  weights  are  random  numbers  in  the  range  [—1,1]  and 
the  fitness  of  a unit  is  defined  as  the  average  of  the  weights  for  that  unit.  In  Model 
C the  weights  are  again  random  numbers  in  the  range  [-1,1]  but  the  fitness  is 
defined  as  the  sum  of  the  weights.  The  choice  of  the  range  [-1,1]  in  the  latter  two 
models  is  to  include  the  possibility  of  both  excitatory  and  inhibitory  synapses.  In 
each  model  the  steps  in  the  update  procedure  are  as  follows: 

1.  Set  up  the  pre- representations  which  are  fully  connected  with  random  synaptic 
weights. 

2.  Calculate  the  fitness  for  each  unit  and  identify  the  unit  with  the  lowest  fitness. 

3.  For  the  least  fit  unit  and  each  of  its  neighbouring  units  reset  the  number  of 
connections  depending  on  the  variant  of  the  model  and  assign  new  weights  at 
random  from  a uniform  distribution  on  [a,  b]. 

4.  Return  to  step  2 and  continue  for  N updates. 

Model  A with  m = 1 is  the  Bak-Sneppen  model.  The  transient  behaviour  of 
this  model  is  characterized  by  punctuated  equilibria  as  the  system  evolves  towards 
a statistically  stationary  state  in  which  the  density  of  weights  in  the  system  with 


79 


1.0 


0.8 


% 0.6 
I 

_u 

I 0.4 

B 


0.2 


20  40  60  80  100 

Unit 

Figure  1.  Snapshot  of  the  stationary  state  for  Model  A with  m = 2 and  fixed  numbers  of  connec- 
tions. 


w < w*  vanishes.  The  critical  weight  w*  ~ 0.66702  ±0.00003  10  is  referred  to  as  the 
self-organized  threshold.  A plot  of  the  largest  value  of  the  minimum  weight  after 
s updates  versus  s reveals  a staircase  structure  where  the  average  length  of  a step 
in  the  staircase  scales  as  a power  law  distribution.  The  change  in  the  minimum 
weight  across  a level  step  in  the  staircase  is  referred  to  as  an  avalanche  9. 

3 Simulations 

Simulations  of  the  models  have  been  carried  out  for  n = 100  units  and  s = 106 
updates  over  a range  of  values  of  the  maximum  number  of  connections  m € [1, 210]; 
both  for  fixed  number  of  connections  and  for  random  numbers  of  connections  (up  to 
the  maximum  m).  To  facilitate  the  discussion  of  these  results  let  /TO(s)  denote  the 
fitness  of  the  least  fit  unit  after  the  sth  update  in  a model  where  m is  the  maximum 
number  of  connections. 

In  all  cases  the  transient  behaviour  exhibits  punctuated  equilibria  and  the  long 
term  behaviour  is  highly  correlated  with  almost  all  fitness  values  above  a critical 
value.  Figure  1 shows  the  weights  for  each  unit  plotted  against  the  unit  number  in 
the  ‘stationary  state’,  after  104  updates,  in  Model  A with  m = 2 and  the  number 
of  connections  fixed.  Except  for  the  localized  avalanche,  all  weights  are  above 
the  limiting  threshold  weight.  Similar  long  term  trends  are  found  in  all  cases 
independent  of  m (compare  for  example,  Figure  1 of  11  which  shows  a similar 
plot  for  the  original  Bak-Sneppen  model),  however  the  magnitude  of  the  limiting 
threshold  fitness  is  m dependent. 

The  transient  behaviours  with  their  characteristic  punctuated  equilibria  are 
shown  for  each  of  the  models  in  Figures  2a, 2b, 3a, 3b, 3c.  These  figures  show  plots  of 
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Figure  2.  Devil’s  staircase  structure  for  Model  A;  (a)  fixed  numbers  of  connections,  (b)  random 
numbers  of  connections. 


the  maximum  value  of  the  minimum  fitness  after  s updates, 

A (m,s)  = max  fm(s), 

[0,s] 

versus  s.  In  each  horizontal  step  of  the  staircase  structure  in  these  figures  the 
minimum  fitness  remains  less  than  this  threshold  value.  A rise  in  the  staircase 
occurs  when  the  minimum  fitness  exceeds  this  threshold  value.  The  case  of  Model 
B with  fixed  numbers  of  connections  was  not  simulated  because  this  is  the  same 
as  Model  C with  fixed  numbers  of  connections  apart  from  the  scale  factor  m (see 
further  comments  below). 

In  Models  A and  B the  limiting  threshold  fitness, 

A *(m)  = lim  max/m(s), 

s— >oo  [0iS] 

decreases  with  an  increase  in  the  maximum  number  of  connections  m.  In  Model  C 
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Figure  3.  Devil’s  staircase  structure;  (a)  Model  B with  random  numbers  of  connections;  (b)  Model 
C with  fixed  numbers  of  connections,  (c)  Model  C with  random  numbers  of  connections. 
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Figure  4.  Plot  of  the  limiting  threshold  fitness  versus  the  maximum  number  of  connections  m:  (a) 
Model  A with  fixed  numbers  of  connections;  (b)  Model  A with  random  numbers  of  connections; 
(c)  Model  B with  random  numbers  of  connections;  (d)  Model  C with  fixed  numbers  of  connections; 
(e)  Model  C with  random  numbers  of  connections. 

the  opposite  is  true.  Figure  4 shows  plots  of  the  limiting  threshold  fitness  A*  (to) 
versus  to  for  each  of  the  models.  The  error  bars  in  the  plots  are  based  on  data 
from  ten  runs  of  each  model  and  the  lines  are  ensemble  averages.  From  these  plots 
we  note  that  for  large  to:  A*  (to)  nu  -i-  - for  Model  A;  A *{m)  ~ - for  Model 

B;  and  A*  (to)  ~ y/in  - for  Model  C.  For  Model  A the  limiting  value  A*(l)  is  the 
Bak-Sneppen  self-organized  criticality  threshold  value  0.66702  — For  models  B 
and  C the  limiting  value  A*(l)  ~ 2 x 0.66702  — 1 = 0.33404. 

4 Scaling  Analysis 

The  functional  relationships  between  the  limiting  threshold  fitness  (the  self- 

organized  threshold)  and  the  maximum  numbers  of  connections  can  be  obtained 

using  simple  probabilistic  arguments.  To  this  end  (following 4 * * * * * * *  12)  we  first  define  the 

avalanche  probability  function,  F’a(to)(s)  as  the  probability  for  an  avalanche  with 

threshold  fitness  A (to)  which  starts  at  k = 0 to  end  at  k = s.  It  follows  immedi- 
ately from  the  definition  that  F>A(m)(0)  = 0.  In  Model  A,  the  probability  for  the 
avalanche  to  end  at  step  k + 1 is 

PxA(m)(k  + !)  = (!-  - A (1) 

The  first  factor  on  the  right  hand  side  is  the  probability  that  the  avalanche  did 

not  stop  at  the  earlier  step  k and  the  factor  (1  — Ayt(TO))3m  is  the  probability  of 
independently  selecting  new  fitness  values  at  random  from  a uniform  distribution 
for  each  of  the  3 to  connections,  for  the  least  fit  unit  and  its  two  neighbours,  above 
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the  threshold  fitness  Ay^m).  It  is  a straightforward  exercise  to  obtain  the  solution 
to  the  difference  equation,  Eq.(l).  The  solution  for  initial  condition  P\A(m)(fy  = 0 
is 

W‘)  = - A„(m))“).  (2) 

At  the  limiting  threshold  values  A^(m)  we  anticipate  that  the  avalanche  proba- 
bilities P\*  (m)(fc)  are  independent  of  m.  In  particular  by  equating  P\‘A(m){k)  — 
P\>  ( i)(k)  we  obtain  the  relation 

Ai(m)  = l-(1-Ai(l))*,  (3) 


where  A^(l)  is  the  limiting  threshold  fitness  for  the  Bak-Sneppen  model.  The 
agreement  between  Eq.  (3)  and  the  values  plotted  in  Figure  4 is  summarized  in 
Table  1. 

We  now  consider  the  scaling  relation  for  model  C.  In  this  case  the  avalanche 
probabilities  are  determined  by  the  equation 

Pxc(m)(k  +!)  = (!-  PAc(m)(*))(l  ~ = ^))3  (4) 


where 


$(x)  = 


dz. 


The  second  factor  on  the  right  hand  side  of  Eq.(4)  follows  from  the  Central  Limit 
Theorem.  For  sufficiently  large  m,  the  sum  Y(m)  of  m random  numbers  from  a 
uniform  distribution  with  zero  mean  and  variance  a2  is  a Gaussian  random  variable 
Z with  zero  mean  and  variance  mcr2.  Hence  the  probability  of  randomly  selecting 
the  sum  Y (m)  > A c{m)  is  equal  to  the  probability  of  selecting  the  Gaussian  random 
variable  Z > • This  probability  is  given  by 


1 _$ 


The  solution  of  Eq.(4)  for  initial  condition  c(m)(0)  = 0 is 


c(m)  (&) 


i+a-^(W))s 


( *c{m)\  3k 

V Mo  ) 1 


(5) 


The  avalanche  probabilities  given  by  Eq.(5)  are  approximations  based  on  the  Cen- 
tral Limit  Theorem  which  holds  with  increasing  accuracy  as  m increases.  The  ap- 
proximation is  already  reasonable  for  m > 2 but  it  does  not  hold  in  the  case  m = 1 
where  the  sum  over  m random  numbers  from  a uniform  distribution  on  [—1, 1]  is 
simply  the  uniform  random  variable.  In  this  case  the  probability  for  choosing  a ran- 
dom variable  greater  than  Ac(l)  is  (1  — Ac(l))/2  so  that  the  avalanche  probability 
is  given  by 


P\c(i)(k) 


c- 


-Ac(l)  ^3 


l + (- 


-*c(l) 


(l-(-l)fc( 


1 - Ac(l) 


)")■ 


2 


(6) 
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Table  1.  Comparison  between  limiting  threshold  fitness  values;  A (m)  obtained  from  the  numerical 
simulations,  and  A (m)  obtained  from  the  theoretical  scaling  relations  Eqs.(3,7).  The  percentage 
difference  between  these  values  is  also  shown. 


B 

A *A(m) 

A^(m) 

% diff. 

A *c(m) 

K cW 

% diff. 

i 

0.67789 

0.67789 

0.00 

0.35751 

— 

— 

2 

0.43259 

0.43245 

0.03 

0.39351 

0.39351 

Kl 

4 

0.24769 

0.24664 

0.42 

0.55863 

0.55650 

H9 

8 

0.13330 

0.13203 

0.95 

0.77354 

0.78701 

1.74 

16 

0.06815 

0.06835 

0.29 

1.07521 

1.11300 

3.51 

32 

0.03501 

0.03478 

0.65 

1.52210 

1.57402 

3.41 

64 

0.01774 

0.01754 

1.13 

2.17175 

2.22601 

128 

0.00884 

0.00881 

0.36 

3.01230 

3.14805 

4.51 

256 

0.00442 

0.00441 

0.29 

4.29448 

4.45201 

3.67 

512 

0.00222 

0.00221 

0.53 

6.09033 

6.29610 

3.38 

1024 

0.00111 

0.00110 

0.47 

8.49563 

8.90402 

4.81 

The  limiting  threshold  value  for  m = 1 in  Model  C is  given  by  Ajb(l)  = 2A^(1)  - 1. 
To  obtain  the  scaling  behaviour  with  m of  the  limiting  threshold  values  A *c{m)  for 
m > 2 we  now  equate  P\-C(m){k)  = P\*c(2){k)  which  yields 


A him)  = 


(7) 


The  good  agreement  between  the  A£(m)  values  plotted  in  Figure  4 for  m > 2 and 
Eq.(7)  is  summarized  in  Table  1. 

The  scaling  behaviour  in  the  case  of  model  B can  be  obtained  by  repeating  the 
analysis  for  model  C but  replacing  $ by  $ j since  the  average 

of  m random  numbers  from  a uniform  distribution  with  zero  mean  and  variance 
a2  is  a Gaussian  random  variable  with  zero  mean  and  variance  a2.  This  yields  the 
scaling  result 


A b(to)  = 


(8) 


5 Discussion 

The  simple  evolutionary  models  for  neural  development  introduced  in  this  paper 
attempt  to  model  the  chance  aspect  of  learning  via  Darwinian  evolution.  All  models 
exhibit  Darwinian  evolution  of  the  synaptic  weight  space  towards  maturation  where 
almost  all  neurons  have  fitness  levels  above  a threshold  value. 

It  is  anticipated  that  the  maximum  number  of  connections  scales  with  memory 
and  learning.  With  this  interpretation  the  ‘fitness’  functions  in  each  of  the  models 
needs  some  clarification  since  from  Figure  4 and  the  scaling  analysis  above  we 
see  that  only  Model  C exhibits  an  increase  of  fitness  with  increasing  memory  and 
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learning.  On  this  basis  the  fitness  function  in  Model  C is  more  representative 
of  learning  and  memory  than  the  fitness  function  in  Models  A and  B.  However  a 
fitness  function  that  scales  with  memory  and  learning  can  also  be  recovered  from  the 
present  fitness  function  in  models  A and  B simply  by  multiplying  by  an  appropriate 
monotonically  increasing  function  of  m.  For  example  multiply  by  to7  where;  7 > 1 
for  Model  A,  7 > 1/2  for  Model  B,  and  7 > 0 for  Model  C. 

It  is  interesting  that  the  scaling  relations:  A *A(m)  ~ 1/m,  A *B(m)  ~ 1 j\fm  and 
A *c(m)  ~ ^/m  which  were  derived  in  the  theoretical  analysis  (and  obtained  numeri- 
cally) for  the  case  of  fixed  numbers  of  connections  are  also  obtained  in  our  numerical 
simulations  when  the  numbers  of  connections  is  selected  at  random  (for  the  least 
fit  unit  and  its  neighbours)  in  each  update.  With  the  appropriate  interpretation 
of  fitness  as  above,  for  a fixed  maximum  number  of  connections,  we  find  that  the 
neurons  self-organize  themselves  to  operate  at  increasing  fitness  whilst  at  the  same 
time  decreasing  the  numbers  of  active  connections.  The  scaling  analysis  of  this 
reduction  in  the  number  of  active  connections  with  increasing  fitness  is  deserving 
of  further  studies. 

Towards  the  end  of  this  study  we  became  aware  of  the  neuronal  model  of  self- 
organized  learning  recently  introduced  by  Chialvo  and  Bak  13 . Our  models  are 
similar  to  their  model  to  the  extent  that  memory  and  learning  is  consolidated  via 
Darwinian  elimination  of  the  least  fit  units  rather  than  via  Hebbian  re-inforcement 
of  the  most  fit  units.  On  the  other  hand  our  models  are  more  crude  than  the 
Chialvo-Bak  model  since  we  do  not  include  the  dynamics  of  firing  in  our  models. 

References 

1.  E.H.  Lenneberg,  Biological  Foundations  of  Language  (Wiley,  1967). 

2.  N.  Chomsky,  Behavioral  and  Brain  Sciences  3 1 (1980). 

3.  W.T.  Greenough,  J.E.  Black  and  C.S.  Wallace,  Child  Development  58,  539 
(1987). 

4.  S.  R.  Quartz  and  T.  J.  Sejnowski,  Behavioral  and  Brain  Sciences  20,  537 
(1997). 

5.  N.  Jerne  in  The  Neurosciences:  A Study  Program,  ed.  G.C.  Quarton,  T. 
Melnechuk,  F.O.  Schmitt  (Rockefeller  University  Press,  1967). 

6.  J.P.  Changeux  and  A.  Danchin,  Nature  264,  705  (1976). 

7.  P.  Rakic,  J.  P.  Bourgeois,  M.F.  Eckenhoff,  N.  Zecevic  and  P.  S.  Goldman-Rakic. 
Science  232,  232  (1986). 

8.  G.  M.  Edelman,  Neural  Darwinism.  The  Theory  of  Neuronal  Group  Selection 
(Basic  Books,  1987). 

9.  P.  Bak  and  K.  Sneppen,  Phys.  Rev.  Letts.  71,  4083  (1993). 

10.  M.  Paczuski,  S.  Maslov,  and  P.  Bak,  Phys.  Rev.  E 53,  414  (1996). 

11.  P.  Bak  and  S.  Boettcher,  Physica  D 107,  143  (1997). 

12.  S.  Boettcher  and  M.  Paczuski,  Phys.  Rev.  Letts  76,  348  (1996). 

13.  D.  Chialvo  and  P.  Bak,  Neuroscience  90,  1137  (1999). 


